Abstract: It is well known that $\ell_1$ minimization can be used
to recover sufficiently sparse unknown signals from compressed
linear measurements. In fact, exact thresholds on the sparsity
have been computed, as a function of the ratio between the system
dimensions, such that with high probability almost all sparse
signals can be recovered from iid Gaussian measurements; these
are referred to as "weak thresholds" [4]. In this paper, we
introduce a reweighted $\ell_1$ recovery algorithm composed of two
steps: a standard $\ell_1$ minimization step to identify a set of
entries where the signal is likely to reside, and a weighted
$\ell_1$ minimization step where entries outside this set are
penalized. For signals where the non-sparse component has iid
Gaussian entries, we prove a "strict" improvement in the weak
recovery threshold. Simulations suggest that the improvement can
be quite impressive: over 20% in the example we consider.

I. Introduction

Compressed sensing addresses the problem of recovering
sparse signals from under-determined systems of linear
equations 11121 . In particular, if $x$ is an $n \times 1$ real
vector that is known to have at most $k$ nonzero elements where
$k < n$, and $A$ is an $m \times n$ measurement matrix with
$k < m < n$, then for appropriate values of $k$, $m$ and $n$, it
is possible to efficiently recover $x$ from $y = Ax$
HI, ||2l, El, |^. The most widely recognized such algorithm is
$\ell_1$ minimization, which can be formulated as follows:

$$\min_{Az = Ax} \|z\|_1 \tag{1}$$

The first result that established the fundamental limits of signal
recovery using $\ell_1$ minimization is due to Donoho and Tanner
El, IS, where it is shown that if the measurement matrix is
iid Gaussian, then for a given ratio $\delta = \frac{m}{n}$,
$\ell_1$ minimization can successfully recover every $k$-sparse
signal, provided that $\mu = \frac{k}{n}$ is smaller than a certain
threshold. This statement is true asymptotically as $n \to \infty$
and with high probability. This threshold guarantees the recovery
of all sufficiently sparse signals and is therefore referred to as
a strong threshold. It does not depend on the actual distribution
of the nonzero entries of the sparse signal and as such is a
universal result. At this point it is not known whether there
exist other polynomial-time algorithms with superior strong
thresholds.

Another notion introduced and computed in [4], [2] is that
of a weak threshold, where signal recovery is guaranteed for
almost all support sets and almost all sign patterns of the
sparse signal, with high probability as $n \to \infty$. The weak
threshold is the one that can be observed in simulations of
$\ell_1$ minimization and allows for signal recovery beyond the
strong threshold. It is also universal in the sense that it
applies to all symmetric distributions that one may draw the
nonzero signal entries from. Finally, it is not known whether
there exist other polynomial-time algorithms with superior weak
thresholds.

In this paper we prove that a certain iterative reweighted
$\ell_1$ algorithm indeed has better weak recovery guarantees for
particular classes of sparse signals, including sparse Gaussian
signals. We had previously introduced these algorithms in
[11], and had proven that for a very restricted class of
polynomially decaying sparse signals they outperform standard
$\ell_1$ minimization. In this paper, however, we extend this
result to a much wider and more reasonable class of sparse signals.
The key to our result is the fact that for these classes of
signals, $\ell_1$ minimization has an approximate support recovery
property which can be exploited via a reweighted $\ell_1$
algorithm to obtain a provably superior weak threshold. In
particular, we consider Gaussian sparse signals, namely sparse
signals where the nonzero entries are iid Gaussian. Our analysis
of Gaussian sparse signals relies on concentration bounds on the
partial sum of their order statistics. Though not done here, it
can be shown that for symmetric distributions with sufficiently
fast decaying tails and nonzero value at the origin, similar
bounds and improvements on the weak threshold can be achieved.

It is worth noting that different variations of reweighted
$\ell_1$ algorithms have recently been introduced in the
literature and have shown experimental improvement over ordinary
$\ell_1$ minimization [10], Q. In Q, approximately sparse signals
have been considered, where perfect recovery is never possible.
However, it has been shown that the recovery noise can be
reduced using an iterative scheme. In [10], a similar algorithm
is suggested and is empirically shown to outperform $\ell_1$
minimization for exactly sparse signals with non-flat
distributions. Unfortunately, [10] provides no theoretical
analysis or performance guarantee. The particular reweighted
$\ell_1$ minimization algorithm that we propose and analyze has
significantly lower computational complexity than the earlier
ones (it only solves two linear programs). Furthermore,
experimental results confirm that it exhibits much better
performance than previous reweighted methods. Finally, while we
do rigorously establish an improvement in the weak threshold, we
currently do not have tight bounds on the new weak threshold, and
simulation results are far better than the bounds we can provide
at this time.

II. Basic Definitions 

A sparse signal with exactly $k$ nonzero entries is called
$k$-sparse. For a vector $x$, $\|x\|_1$ denotes the $\ell_1$ norm.
The support (set) of $x$, denoted by $\mathrm{supp}(x)$, is the
index set of its nonzero coordinates. For a vector $x$ that is not
exactly $k$-sparse, we define the $k$-support of $x$ to be the
index set of the largest $k$ entries of $x$ in amplitude, and
denote it by $\mathrm{supp}_k(x)$. For a subset $K$ of the entries
of $x$, $x_K$ means the vector formed by those entries of $x$
indexed in $K$. Finally, $\max |x|$ and $\min |x|$ mean the
absolute value of the maximum and minimum entry of $x$ in
magnitude, respectively.

III. Signal Model and Problem Description 

We consider sparse random signals with iid Gaussian
nonzero entries. In other words we assume that the unknown
sparse signal is an $n \times 1$ vector $x$ with exactly $k$
nonzero entries, where each nonzero entry is independently drawn
from the standard normal distribution $\mathcal{N}(0, 1)$. The
measurement matrix $A$ is an $m \times n$ matrix with iid Gaussian
entries and a ratio of dimensions $\delta = \frac{m}{n}$.
Compressed sensing theory guarantees that if $\mu = \frac{k}{n}$
is smaller than a certain threshold, then every $k$-sparse signal
can be recovered using $\ell_1$ minimization. The relationship
between $\delta$ and the maximum threshold of $\mu$ for which such
a guarantee exists is called the strong sparsity threshold, and is
denoted by $\mu_S(\delta)$. A more practical performance guarantee
is the so-called weak sparsity threshold, denoted by
$\mu_W(\delta)$, which has the following interpretation. For a
fixed value of $\delta = \frac{m}{n}$ and an iid Gaussian matrix
$A$ of size $m \times n$, a random $k$-sparse vector $x$ of size
$n \times 1$ with a randomly chosen support set and a random sign
pattern can be recovered from $Ax$ using $\ell_1$ minimization
with high probability if $\frac{k}{n} < \mu_W(\delta)$. Similar
recovery thresholds can be obtained by imposing more or fewer
restrictions. For example, strong and weak thresholds for
nonnegative signals have been evaluated in [6].

We assume that the support size of $x$, namely $k$, is slightly
larger than the weak threshold of $\ell_1$ minimization. In other
words, $k = (1 + \epsilon_0)\mu_W(\delta)n$ for some
$\epsilon_0 > 0$. This means that if we use $\ell_1$ minimization,
a randomly chosen $\mu_W(\delta)n$-sparse signal will be recovered
perfectly with very high probability, whereas a randomly selected
$k$-sparse signal will not. We would like to show that for a
strictly positive $\epsilon_0$, the iterative reweighted $\ell_1$
algorithm of Section IV can indeed recover a randomly selected
$k$-sparse signal with high probability, which means that it has
an improved weak threshold.

IV. Iterative Weighted $\ell_1$ Algorithm

We propose the following algorithm, consisting of two $\ell_1$
minimization steps: a standard one and a weighted one. The
input to the algorithm is the vector $y = Ax$, where $x$ is a
$k$-sparse signal with $k = (1 + \epsilon_0)\mu_W(\delta)n$, and
the output is an approximation $x^*$ to the unknown vector $x$.
We assume that $k$, or an upper bound on it, is known. Also
$\omega > 1$ is a predetermined weight.

Fig. 1: A pictorial example of a sparse signal and its $\ell_1$ minimization approximation.

Algorithm 1.

1) Solve the $\ell_1$ minimization problem:

$$\hat{x} = \arg\min \|z\|_1 \ \text{ subject to } \ Az = Ax. \tag{2}$$

2) Obtain an approximation for the support set of $x$: find
the index set $L \subset \{1, 2, \dots, n\}$ which corresponds to
the largest $k$ elements of $\hat{x}$ in magnitude.

3) Solve the following weighted $\ell_1$ minimization problem
and declare the solution as output:

$$x^* = \arg\min \|z_L\|_1 + \omega \|z_{\bar{L}}\|_1 \ \text{ subject to } \ Az = Ax. \tag{3}$$
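The three steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes NumPy and SciPy are available, casts each $\ell_1$ problem as a linear program through the standard split $z = p - q$ with $p, q \ge 0$, and the function names `weighted_l1` and `two_step_reweighted_l1` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1(A, y, w):
    """Solve min sum_i w_i |z_i| subject to A z = y as a linear program.

    Uses the split z = p - q with p, q >= 0, so that |z_i| = p_i + q_i
    at the optimum (all weights are assumed strictly positive).
    """
    n = A.shape[1]
    cost = np.concatenate([w, w])
    A_eq = np.hstack([A, -A])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


def two_step_reweighted_l1(A, y, k, omega):
    """Sketch of Algorithm 1: plain l1, k-support selection, weighted l1."""
    n = A.shape[1]
    # Step 1: standard l1 minimization (all weights equal to 1).
    x_hat = weighted_l1(A, y, np.ones(n))
    # Step 2: L = indices of the k largest entries of x_hat in magnitude.
    L = np.argsort(-np.abs(x_hat))[:k]
    # Step 3: weight 1 on L, weight omega > 1 on the complement of L.
    w = np.full(n, float(omega))
    w[L] = 1.0
    x_star = weighted_l1(A, y, w)
    return x_star, x_hat, L
```

A call such as `two_step_reweighted_l1(A, A @ x, k, omega=10.0)` returns the final estimate together with the intermediate $\ell_1$ solution and the index set $L$. Only two linear programs are solved, which matches the complexity remark made in the introduction.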

The intuition behind the algorithm should be clear. In the
first step we perform a standard $\ell_1$ minimization. If the
sparsity of the signal is beyond the weak threshold
$\mu_W(\delta)n$, then $\ell_1$ minimization is not capable of
recovering the signal. However, we use the output of the $\ell_1$
minimization to identify an index set, $L$, which we "hope"
contains most of the nonzero entries of $x$. We finally perform a
weighted $\ell_1$ minimization by penalizing those entries of $x$
that are not in $L$ (ostensibly because they have a lower chance
of being nonzero).

In the next sections we formally prove that the above
intuition is correct and that, for certain classes of signals,
Algorithm 1 has a recovery threshold beyond that of standard
$\ell_1$ minimization. The idea of the proof is as follows. In
Section V we prove that there is a large overlap between the index
set $L$, found in step 2 of the algorithm, and the support set of
the unknown signal $x$ (denoted by $K$); see Theorem 1 and
Figure 1. Then in Section VI we show that the large overlap
between $K$ and $L$ can result in perfect recovery of $x$, beyond
the standard weak threshold, when a weighted $\ell_1$ minimization
is used in step 3.

V. Approximate Support Recovery, Steps 1 and 2 of 
the Algorithm 

In this section, we carefully study the first two steps of
Algorithm 1. The unknown signal $x$ is assumed to be a
Gaussian $k$-sparse vector with support set $K$, where
$k = |K| = (1 + \epsilon_0)\mu_W(\delta)n$ for some
$\epsilon_0 > 0$. By a Gaussian $k$-sparse vector, we mean one
where the nonzero entries are iid Gaussian (zero mean and unit
variance, say). The solution $\hat{x}$ to the $\ell_1$
minimization obtained in step 1 of Algorithm 1 is in all
likelihood a full vector. The set $L$, as defined in the
algorithm, is in fact the $k$-support set of $\hat{x}$. We show
that for small enough $\epsilon_0$, the intersection of $L$ and
$K$ is with high probability very large, so that $L$ can be
counted as a good approximation to $K$ (Figure 1).

In order to find a decent lower bound on $|L \cap K|$, we mention
three separate facts and establish a connection between them.
First, we prove a general lemma that bounds $|K \cap L|$ as a
function of $\|x - \hat{x}\|_1$. Then, we mention an intrinsic
property of $\ell_1$ minimization called weak robustness that
provides an upper bound on the quantity $\|x - \hat{x}\|_1$.
Finally, we specifically use the Gaussianity of $x$ to obtain
Theorem 1. Let us start with a definition.

Definition 1. For a $k$-sparse signal $x$, we define
$W(x, \lambda)$ to be the size of the largest subset of nonzero
entries of $x$ that has an $\ell_1$ norm less than or equal to
$\lambda$:

$$W(x, \lambda) := \max\{\, |S| \;:\; S \subseteq \mathrm{supp}(x),\ \|x_S\|_1 \le \lambda \,\}$$

Note that $W(x, \lambda)$ is increasing in $\lambda$.
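The quantity in Definition 1 is straightforward to compute: the largest subset with $\ell_1$ norm at most $\lambda$ is obtained by taking nonzero entries in order of increasing magnitude until the budget $\lambda$ is exhausted. The following Python sketch illustrates this under the assumption that NumPy is available; the function name `w_count` is a hypothetical choice made here.

```python
import numpy as np


def w_count(x, lam):
    """Size of the largest subset of nonzero entries of x whose l1 norm is <= lam.

    Greedy argument: the smallest magnitudes are the cheapest, so the
    optimal subset consists of the smallest nonzero entries.
    """
    x = np.asarray(x, dtype=float)
    mags = np.sort(np.abs(x[x != 0]))       # nonzero magnitudes, ascending
    partial_sums = np.cumsum(mags)          # l1 norm of the j smallest entries
    # Number of partial sums that do not exceed the budget lam.
    return int(np.searchsorted(partial_sums, lam, side="right"))
```

For example, for `x = [3, -1, 0, 2, 0.5]` the sorted nonzero magnitudes are 0.5, 1, 2, 3 with partial sums 0.5, 1.5, 3.5, 6.5, so `w_count(x, 3.5)` counts three entries.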

Lemma 1. Let $x$ be a $k$-sparse vector and $\hat{x}$ be another
vector. Also, let $K$ be the support set of $x$ and $L$ be the
$k$-support set of $\hat{x}$. Then

$$|K \cap L| \ge k - W(x, \|x - \hat{x}\|_1) \tag{4}$$



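Proof: Let $x_i$ be the $i$th entry of $x$ and $e^*$ be the
solution to the following minimization program

$$\min \|e\|_1 \quad \text{s.t.} \quad \begin{cases} \max |(x + e)_{K \setminus L}| \le \min |(x + e)_{K \cap L}| \\ \max |(x + e)_{K \setminus L}| \le \min |(x + e)_{L \setminus K}| \end{cases} \tag{5}$$

Now since $\hat{x} = x + (\hat{x} - x)$ satisfies the constraints
of the minimization (5), we can write

$$\|e^*\|_1 \le \|x - \hat{x}\|_1 \tag{6}$$

Let $a = \max |(x + e^*)_{K \setminus L}|$. Then for each
$i \in K \setminus L$, using the triangle inequality we have

$$|x_i| - |e^*_i| \le |x_i + e^*_i| \le a \tag{7}$$

Therefore, by summing up the inequalities in (7) for
$i \in K \setminus L$ we have

$$\|e^*_{K \setminus L}\|_1 \ge \sum_{i \in K \setminus L,\ |x_i| > a} (|x_i| - a) \tag{8}$$

Similarly, since $x$ vanishes on $L \setminus K$,

$$\|e^*_{L \setminus K}\|_1 \ge a\,|L \setminus K| \tag{9}$$

But $|L \setminus K| = |K \setminus L|$ and therefore we have

$$\|e^*\|_1 \ge \sum_{i \in K \setminus L,\ |x_i| > a} (|x_i| - a) + a\,|K \setminus L| \ge \sum_{i \in K \setminus L} |x_i| \tag{10}$$

(6) and (10) together imply that
$\|x - \hat{x}\|_1 \ge \|x_{K \setminus L}\|_1$, which by
definition means that $W(x, \|x - \hat{x}\|_1) \ge |K \setminus L|$.
Since $|K \cap L| = k - |K \setminus L|$, the claim follows. ■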
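Because Lemma 1 is deterministic, it can be sanity-checked numerically on arbitrary perturbations of a sparse vector. The Python sketch below is illustrative only; it assumes NumPy, and the helper names `w_count`, `k_support` and `lemma1_holds` are hypothetical. It compares $|K \cap L|$ with $k - W(x, \|x - \hat{x}\|_1)$ for a given pair $(x, \hat{x})$.

```python
import numpy as np


def w_count(x, lam):
    """W(x, lam): largest number of nonzero entries of x with l1 norm <= lam."""
    mags = np.sort(np.abs(x[x != 0]))
    return int(np.searchsorted(np.cumsum(mags), lam, side="right"))


def k_support(v, k):
    """Index set of the k largest entries of v in magnitude."""
    return set(np.argsort(-np.abs(v))[:k].tolist())


def lemma1_holds(x, x_hat):
    """Check |K n L| >= k - W(x, ||x - x_hat||_1) for one pair (x, x_hat)."""
    K = set(np.flatnonzero(x).tolist())     # support of x
    k = len(K)
    L = k_support(x_hat, k)                 # k-support of x_hat
    err = np.abs(x - x_hat).sum()           # l1 distance between x and x_hat
    return len(K & L) >= k - w_count(x, err)
```

Running `lemma1_holds` over many random sparse vectors and random perturbations should never return `False`; a failure would indicate a bug in the helper functions rather than in the lemma.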



We now introduce the notion of weak robustness, which
allows us to bound $\|x - \hat{x}\|_1$, and has the following
formal definition [9].

Definition 2. Let the set $S \subset \{1, 2, \dots, n\}$ and the
subvector $x_S$ be fixed. A solution $\hat{x}$ is called weakly
robust if, for some $C > 1$ called the robustness factor, and all
$x_{\bar{S}}$, it holds that

$$\|(x - \hat{x})_{\bar{S}}\|_1 \le \frac{2C}{C - 1}\,\|x_{\bar{S}}\|_1 \tag{11}$$

and

$$\|x_S\|_1 - \|\hat{x}_S\|_1 \le \frac{2}{C - 1}\,\|x_{\bar{S}}\|_1 \tag{12}$$



The weak robustness notion allows us to bound the error
$\|x - \hat{x}\|_1$ in the following way. If the matrix $A_S$,
obtained by retaining only those columns of $A$ that are indexed
by $S$, has full column rank, then the quantity

$$\kappa = \max_{Aw = 0,\ w \neq 0} \frac{\|w_S\|_1}{\|w_{\bar{S}}\|_1} \tag{13}$$

must be finite, and one can write

$$\|x - \hat{x}\|_1 \le \frac{2C(1 + \kappa)}{C - 1}\,\|x_{\bar{S}}\|_1$$

In [9], it has been shown that for Gaussian iid measurement
matrices $A$, $\ell_1$ minimization is weakly robust, i.e., there
exists a robustness factor $C > 1$ as a function of
$\frac{|S|}{n} < \mu_W(\delta)$ for which (11) and (12) hold. Now
let $k_1 = (1 - \epsilon_1)\mu_W(\delta)n$ for some small
$\epsilon_1 > 0$, and $K_1$ be the $k_1$-support set of $x$,
namely, the set of the largest $k_1$ entries of $x$ in magnitude.
Based on equation (13) we may write

$$\|x - \hat{x}\|_1 \le \frac{2C(1 + \kappa)}{C - 1}\,\|x_{\bar{K}_1}\|_1 \tag{14}$$

For a fixed value of $\delta$, $C$ in (14) is a function of
$\epsilon_1$ and becomes arbitrarily close to 1 as
$\epsilon_1 \to 0$. $\kappa$ is also a bounded function of
$\epsilon_1$ and therefore we may replace it with an upper
bound $\kappa^*$. We now have a bound on $\|x - \hat{x}\|_1$. To
explore this inequality and understand its asymptotic behavior,
we apply a third result, which is a certain concentration bound
on the order statistics of Gaussian random variables.

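Lemma 2. Suppose $X_1, X_2, \dots, X_N$ are $N$ iid
$\mathcal{N}(0, 1)$ random variables. Let
$S_N = \sum_{i=1}^{N} |X_i|$ and let $S_M$ be the sum of
the largest $M$ numbers among the $|X_i|$, for each
$1 \le M \le N$. Then for every $\epsilon > 0$, as
$N \to \infty$, we have

$$P\left(\left|\frac{S_N}{N} - \sqrt{\frac{2}{\pi}}\right| > \epsilon\right) \to 0, \tag{15}$$

$$P\left(\left|\frac{S_M}{S_N} - \exp\left(-\frac{\Psi^2\left(\frac{M}{2N}\right)}{2}\right)\right| > \epsilon\right) \to 0, \tag{16}$$

where $\Psi(x) = Q^{-1}(x)$ with
$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy$.

As a direct consequence of Lemma 2 we can write:

$$P\left(\left|\frac{\|x_{\bar{K}_1}\|_1}{\|x\|_1} - \left(1 - e^{-0.5\,\Psi^2\left(0.5\,\frac{1 - \epsilon_1}{1 + \epsilon_0}\right)}\right)\right| > \epsilon\right) \to 0 \tag{17}$$

for all $\epsilon > 0$ as $n \to \infty$. Define

$$C(\epsilon_0) := \inf_{\epsilon_1 > 0} \frac{2C(1 + \kappa^*)}{C - 1}\left(1 - e^{-0.5\,\Psi^2\left(0.5\,\frac{1 - \epsilon_1}{1 + \epsilon_0}\right)}\right)$$

Here $C(\epsilon_0)$, always written with its argument, is distinct
from the robustness factor $C$ appearing on the right-hand side.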
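The concentration statements of Lemma 2 lend themselves to a quick Monte Carlo check. The sketch below is an illustration, not part of the original analysis. It assumes NumPy and SciPy, uses `scipy.stats.norm.isf` as the inverse Gaussian tail function $Q^{-1}$, and takes the limit of $S_M / S_N$ to be $\exp(-\Psi^2(M/2N)/2)$, which follows from $E[|X|\,;\,|X| > t] = \sqrt{2/\pi}\, e^{-t^2/2}$ with $2Q(t) = M/N$. The function names are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def predicted_top_fraction(ratio):
    """Predicted limit of S_M / S_N when M / N = ratio.

    Psi = Q^{-1} is the inverse Gaussian tail function (norm.isf).
    """
    t = norm.isf(ratio / 2.0)
    return float(np.exp(-0.5 * t ** 2))


def empirical_fractions(N, M, rng):
    """Return (S_M / S_N, S_N / N) for one draw of N standard normals."""
    mags = np.sort(np.abs(rng.standard_normal(N)))   # ascending magnitudes
    s_n = mags.sum()
    s_m = mags[N - M:].sum()                         # sum of the M largest
    return s_m / s_n, s_n / N
```

With `N` in the hundreds of thousands, the empirical ratio should agree with `predicted_top_fraction(M / N)` to about two decimal places, and `S_N / N` should be close to $\sqrt{2/\pi} \approx 0.798$.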

Incorporating (14) into (17) we may write

$$P\left(\frac{\|x - \hat{x}\|_1}{\|x\|_1} - C(\epsilon_0) < \epsilon\right) \to 1 \tag{18}$$

for all $\epsilon > 0$ as $n \to \infty$. Let us summarize our
conclusions so far. First, we were able to show that
$|K \cap L| \ge k - W(x, \|x - \hat{x}\|_1)$. Weak robustness of
$\ell_1$ minimization and Gaussianity of the signal then led us to
the fact that for large $n$, with high probability
$\|x - \hat{x}\|_1 \le C(\epsilon_0)\|x\|_1$. These results build
up to the next key theorem, which is the conclusion of this
section.

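Theorem 1 (Support Recovery). Let $A$ be an iid Gaussian
$m \times n$ measurement matrix with $\frac{m}{n} = \delta$. Let
$k = (1 + \epsilon_0)\mu_W(\delta)n$ and $x$ be an $n \times 1$
random Gaussian $k$-sparse signal. Suppose that $\hat{x}$ is the
approximation to $x$ given by the $\ell_1$ minimization, i.e.
$\hat{x} = \arg\min_{Az = Ax} \|z\|_1$. Then, as $n \to \infty$,
for all $\epsilon > 0$,

$$P\left(\frac{|\mathrm{supp}(x) \cap \mathrm{supp}_k(\hat{x})|}{k} - 2Q\left(\sqrt{-2\log(1 - C(\epsilon_0))}\right) > -\epsilon\right) \to 1. \tag{19}$$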



Proof: For each $\epsilon' > 0$ and large enough $n$, with high
probability it holds that
$\|x - \hat{x}\|_1 < (C(\epsilon_0) + \epsilon')\|x\|_1$.
Therefore, from Lemma 1 and the fact that $W(x, \lambda)$ is
increasing in $\lambda$,
$|K \cap L| \ge k - W(x, (C(\epsilon_0) + \epsilon')\|x\|_1)$
with high probability. Also, an implication of Lemma 2
reveals that for any positive $\epsilon''$ and $\alpha$,
$\frac{W(x, \alpha\|x\|_1)}{k} < \left(1 - 2Q\left(\sqrt{-2\log(1 - \alpha)}\right)\right) + \epsilon''$
for large enough $n$. Putting these together, we conclude that
with very high probability
$\frac{|K \cap L|}{k} \ge 2Q\left(\sqrt{-2\log(1 - C(\epsilon_0) - \epsilon')}\right) - \epsilon''$.
The desired result now follows from the continuity of the
$\log(\cdot)$ and $Q(\cdot)$ functions. ■

Note that if $\lim_{\epsilon_0 \to 0} C(\epsilon_0) = 0$, then
Theorem 1 implies that $\frac{|K \cap L|}{k}$ becomes arbitrarily
close to 1.
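To get a feel for the overlap guarantee of Theorem 1, the lower bound $2Q(\sqrt{-2\log(1 - c)})$ can be evaluated for a few values of $c$, where $c$ stands in for $C(\epsilon_0)$. The Python sketch below assumes NumPy and SciPy; the function name `overlap_lower_bound` is hypothetical, and the values of `c` a reader plugs in are illustrative rather than taken from the paper.

```python
import numpy as np
from scipy.stats import norm


def overlap_lower_bound(c):
    """Evaluate 2 Q( sqrt(-2 log(1 - c)) ) for 0 <= c < 1.

    Q is the standard Gaussian tail function (norm.sf). The bound equals 1
    at c = 0 and decreases toward 0 as c approaches 1.
    """
    c = np.asarray(c, dtype=float)
    t = np.sqrt(-2.0 * np.log1p(-c))   # log1p(-c) = log(1 - c), accurate near 0
    return 2.0 * norm.sf(t)
```

The function is monotonically decreasing in `c`, which mirrors the remark above: as $C(\epsilon_0)$ shrinks to zero, the guaranteed fraction of the true support captured by $L$ tends to 1.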

VI. Perfect Recovery, Step 3 of the Algorithm 

In Section V we showed that, if $\epsilon_0$ is small, the
$k$-support of $\hat{x}$, namely $L = \mathrm{supp}_k(\hat{x})$,
has a significant overlap with the true support of $x$. We even
found a quantitative lower bound on the size of this overlap in
Theorem 1. In step 3 of Algorithm 1, weighted $\ell_1$
minimization is used, where the entries in $\bar{L}$ are assigned
a higher weight than those in $L$. In [8], we have been able to
analyze the performance of such weighted $\ell_1$ minimization
algorithms. The idea is that if a sparse vector $x$ can be
partitioned into two sets $L$ and $\bar{L}$, where in one set
the fraction of nonzeros is much larger than in the other set,
then (3) can potentially recover $x$ with an appropriate choice
of the weight $\omega > 1$, even though $\ell_1$ minimization
cannot. The following theorem can be deduced from [8].

Theorem 2. Let $L \subset \{1, 2, \dots, n\}$, $\omega > 1$ and
the fractions $f_1, f_2 \in [0, 1]$ be given. Let
$\gamma_1 = \frac{|L|}{n}$ and $\gamma_2 = 1 - \gamma_1$.
There exists a threshold
$\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)$ such that with
high probability, almost all random sparse vectors $x$ with at
least $f_1\gamma_1 n$ nonzero entries over the set $L$, and at
most $f_2\gamma_2 n$ nonzero entries over the set $\bar{L}$, can
be perfectly recovered using
$\min_{Az = Ax} \|z_L\|_1 + \omega\|z_{\bar{L}}\|_1$, where $A$ is
a $\delta_c n \times n$ matrix with iid Gaussian entries.
Furthermore, for appropriate $\omega$,

$$\mu_W(\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)) < f_1\gamma_1 + f_2\gamma_2,$$

i.e., standard $\ell_1$ minimization using a
$\delta_c n \times n$ measurement matrix with iid Gaussian entries
cannot recover such $x$.

Fig. 2: An approximate upper bound for $C(\epsilon_0)$ for $\delta = 0.555$.

For completeness, in Appendix A we provide the calculation
of $\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)$. A software
package for computing such thresholds can also be found in [13].

Theorem 3 (Perfect Recovery). Let $A$ be an $m \times n$ i.i.d.
Gaussian matrix with $\frac{m}{n} = \delta$. If
$\lim_{\epsilon_0 \to 0} C(\epsilon_0) = 0$ and
$\delta_c(\mu_W(\delta), 1 - \mu_W(\delta), 1, 0, \omega) < \delta$,
then there exist $\epsilon_0 > 0$ and $\omega > 0$ so that
Algorithm 1 perfectly recovers a random
$(1 + \epsilon_0)\mu_W(\delta)n$-sparse vector with i.i.d.
Gaussian entries with high probability as $n$ grows to infinity.

Proof: Using the statement of Theorem 2, in order to show that
$x$ is perfectly recovered in the last step of the algorithm, it
is sufficient to find the overlap fractions
$f_1 = \frac{|L \cap K|}{|L|}$ and
$f_2 = \frac{|\bar{L} \cap K|}{|\bar{L}|}$ for a given
$\epsilon_0$, and show that
$\delta_c(\frac{k}{n}, 1 - \frac{k}{n}, f_1, f_2, \omega) < \delta$.
On the other hand, according to Theorem 1, as
$\epsilon_0 \to 0$, $f_1 \to 1$ and $f_2 \to 0$. Therefore, if
$\delta_c(\mu_W(\delta), 1 - \mu_W(\delta), 1, 0, \omega) < \delta$,
from the continuity of $\delta_c$ we can conclude that for a
strictly positive $\epsilon_0$ and corresponding overlap fractions
$f_1$ and $f_2$,
$\delta_c((1 + \epsilon_0)\mu_W(\delta), 1 - (1 + \epsilon_0)\mu_W(\delta), f_1, f_2, \omega) < \delta$,
which completes the proof. ■
For $\delta = 0.555$ it is easy to verify numerically that the
conditions of Theorem 3 hold. We have chosen $a = 1$ and
have computed an approximate upper bound $C^*(\epsilon_0)$ for
$C(\epsilon_0)$, using the results of [9]. This is depicted in
Figure 2. As shown, when $\epsilon_0 \to 0$, $C^*(\epsilon_0)$
becomes arbitrarily small too. Using this curve and the numerical
$\delta_c$ function from Appendix A we can show that for
$\omega = 10$, the value of $\epsilon_0$ = 5 x 10^^ satisfies the
statement of Theorem 3. This improvement is of course much
smaller than what we observe in practice.

VII. Beyond Gaussians and Simulations 

It is reasonable to ask if we can prove a theoretical threshold
improvement for sparse signals with other distributions. The
attentive reader will note that the only step where we used the
Gaussianity of the signal was in the order statistics results
of Lemma 2. This result has the following interpretation. For
iid random variables, the ratio $1 - \frac{S_M}{S_N}$ can be
approximated by a known function of $\frac{M}{N}$. In the Gaussian
case, this function behaves as $(1 - \frac{M}{N})^2$ as
$M \to N$. For constant magnitude signals (say BPSK), the function
behaves as $1 - \frac{M}{N}$ for $M \to N$, which proves that the
reweighted method yields no improvement. A more careful analysis,
beyond the scope and space of this paper, reveals that the
improvement over $\ell_1$ minimization depends on the behavior of
this function as $M \to N$, which in turn depends on the smallest
order $n$ for which $f^{(n)}(0) \neq 0$, i.e., the smallest $n$
such that the $n$-th derivative of the distribution at the origin
is nonzero.

Fig. 3: Empirical Recovery Percentage for $n = 200$ and $\delta = 0.5555$.

These are exemplified by the simulations in Figure 3.
Here the signal dimension is $n = 200$, and the number of
measurements is $m = 112$, which corresponds to a value of
$\delta = 0.5555$. We generated random sparse signals with iid
entries coming from certain distributions: Gaussian, uniform,
Rayleigh, square root of chi-square with 4 degrees of freedom,
and square root of chi-square with 6 degrees of freedom. Solid
lines represent the simulation results for ordinary $\ell_1$
minimization, and different colors indicate different
distributions. Dashed lines are used to show the results for
Algorithm 1. Note that the more derivatives that vanish at the
origin, the smaller the improvement over $\ell_1$ minimization.
The Gaussian and uniform distributions are flat and nonzero at the
origin and show an impressive improvement of more than 20% in the
weak threshold (from 45 to 55).
APPENDIX

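A. Computation of the $\delta_c$ Threshold

The following formulas for
$\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)$ are given in [8].

$$\delta_c = \min\{\, \delta \mid \psi_{com}(\tau_1, \tau_2) - \psi_{int}(\tau_1, \tau_2) - \psi_{ext}(\tau_1, \tau_2) < 0 \ \ \forall\ 0 \le \tau_1 \le \gamma_1(1 - f_1),\ 0 \le \tau_2 \le \gamma_2(1 - f_2),\ \tau_1 + \tau_2 > \delta - \gamma_1 f_1 - \gamma_2 f_2 \,\}$$

where $\psi_{com}$, $\psi_{int}$ and $\psi_{ext}$ are obtained as
follows. Define $g(x) = \frac{2}{\sqrt{\pi}} e^{-x^2}$,
$G(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2}\, dy$ and let
$\varphi(\cdot)$ and $\Phi(\cdot)$ be the standard Gaussian pdf and
cdf functions respectively.

$$\psi_{com}(\tau_1, \tau_2) = \left(\tau_1 + \tau_2 + \gamma_1(1 - f_1) H\left(\frac{\tau_1}{\gamma_1(1 - f_1)}\right) + \gamma_2(1 - f_2) H\left(\frac{\tau_2}{\gamma_2(1 - f_2)}\right) + \gamma_1 H(f_1) + \gamma_2 H(f_2)\right) \log 2 \tag{20}$$

where $H(\cdot)$ is the Shannon entropy function. Define
$c = (\tau_1 + \gamma_1 f_1) + \omega^2(\tau_2 + \gamma_2 f_2)$,
$\alpha_1 = \gamma_1(1 - f_1) - \tau_1$ and
$\alpha_2 = \gamma_2(1 - f_2) - \tau_2$. Let $x_0$ be the unique
solution to $x$ of the equation
$2c - \frac{g(x)\alpha_1}{x G(x)} - \frac{\omega g(\omega x)\alpha_2}{x G(\omega x)} = 0$.
Then

$$\psi_{ext}(\tau_1, \tau_2) = c x_0^2 - \alpha_1 \log G(x_0) - \alpha_2 \log G(\omega x_0) \tag{21}$$

Let $b = \frac{\tau_1 + \omega^2 \tau_2}{\tau_1 + \tau_2}$,
$\Omega' = \gamma_1 f_1 + \omega^2 \gamma_2 f_2$ and
$Q(s) = \frac{\tau_1 \varphi(s)}{(\tau_1 + \tau_2)\Phi(s)} + \frac{\omega \tau_2 \varphi(\omega s)}{(\tau_1 + \tau_2)\Phi(\omega s)}$.
Define the function $\hat{M}(s) = -\frac{s}{Q(s)}$ and solve for
$s$ in
$\hat{M}(s) = \frac{\tau_1 + \tau_2}{(\tau_1 + \tau_2) b + \Omega'}$.
Let the unique solution be $s^*$ and set
$y = s^*\left(b - \frac{1}{\hat{M}(s^*)}\right)$. Compute the rate
function
$\Lambda^*(y) = sy - \frac{\tau_1}{\tau_1 + \tau_2}\Lambda_1(s) - \frac{\tau_2}{\tau_1 + \tau_2}\Lambda_1(\omega s)$
at the point $s = s^*$, where
$\Lambda_1(s) = \frac{s^2}{2} + \log(2\Phi(s))$. The internal
angle exponent is then given by:

$$\psi_{int}(\tau_1, \tau_2) = \left(\Lambda^*(y) + \frac{\tau_1 + \tau_2}{2\Omega'}\, y^2 + \log 2\right)(\tau_1 + \tau_2) \tag{22}$$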



